SCARF: a segmental conditional random field toolkit for speech recognition

نویسندگان

  • Geoffrey Zweig
  • Patrick Nguyen
چکیده

This paper describes a new toolkit SCARF for doing speech recognition with segmental conditional random fields. It is designed to allow for the integration of numerous, possibly redundant segment level acoustic features, along with a complete language model, in a coherent speech recognition framework. SCARF performs a segmental analysis, where each segment corresponds to a word, thus allowing for the incorporation of acoustic features defined at the phoneme, multi-phone, syllable and word level. SCARF is designed to make it especially convenient to use acoustic detection events as input, such as the detection of energy bursts, phonemes, or other events. Language modeling is done by associating each state in the SCRF with a state in an underlying n-gram language model, and SCARF supports the joint and discriminative training of language model and acoustic model parameters. SCARF is available for download from http://research.microsoft.com/en-us/projects/scarf/

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

SCARF: A Segmental CRF Speech Recognition System

We propose a theoretical framework for doing speech recognition with segmental conditional random fields, and describe the impleme-nation of a toolkit for experimenting with these models. This framework allows users to easily incorporate multiple detector streams into a discriminatively trained direct model for large vocabulary continuous speech recognition. The detector streams can operate at ...

متن کامل

Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition

The use of Deep Belief Networks (DBN) to pretrain Neural Networks has recently led to a resurgence in the use of Artificial Neural Network Hidden Markov Model (ANN/HMM) hybrid systems for Automatic Speech Recognition (ASR). In this paper we report results of a DBN-pretrained context-dependent ANN/HMM system trained on two datasets that are much larger than any reported previously with DBN-pretr...

متن کامل

A comparison of training approaches for discriminative segmental models

Segmental models such as segmental conditional random fields have had some recent success in lattice rescoring for speech recognition. They provide a flexible framework for incorporating a wide range of features across different levels of units, such as phones and words. However, such models have mainly been trained by maximizing conditional likelihood, which may not be the best proxy for the t...

متن کامل

NNVLP: A Neural Network-Based Vietnamese Language Processing Toolkit

This paper demonstrates neural networkbased toolkit namely NNVLP for essential Vietnamese language processing tasks including part-of-speech (POS) tagging, chunking, named entity recognition (NER). Our toolkit is a combination of bidirectional Long Short-Term Memory (Bi-LSTM), Convolutional Neural Network (CNN), Conditional Random Field (CRF), using pre-trained word embeddings as input, which a...

متن کامل

Deep segmental neural networks for speech recognition

Hybrid systems which integrate the deep neural network (DNN) and hidden Markov model (HMM) have recently achieved remarkable performance in many large vocabulary speech recognition tasks. These systems, however, remain to rely on the HMM and assume the acoustic scores for the (windowed) frames are independent given the state, suffering from the same difficulty as in the previous GMM-HMM systems...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010